Continual learning from stationary and non-stationary data
Continual learning aims at developing models that can work on constantly evolving problems over a long time horizon. In such environments, we can distinguish three essential aspects of training and maintaining machine learning models: incorporating new knowledge, retaining it, and reacting to changes. Each of them poses its own challenges, together constituting a compound problem with multiple goals.
Remembering previously incorporated concepts is the key property required of a model dealing with stationary distributions. In non-stationary environments, models should be capable of selectively forgetting outdated decision boundaries and adapting to new concepts. Finally, combining these two abilities within a single learning algorithm is a significant difficulty in itself, since such scenarios require balancing remembering and forgetting instead of focusing on only one of them.
The presented dissertation addressed these problems in an exploratory way. Its main goal was to grasp the continual learning paradigm as a whole, analyze its different branches, and tackle identified issues covering various aspects of learning from sequentially incoming data. By doing so, this work not only filled several gaps in the current continual learning research but also emphasized the complexity and diversity of the challenges existing in this domain. Comprehensive experiments conducted for all of the presented contributions demonstrated their effectiveness and substantiated the validity of the stated claims.
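These three aspects map directly onto how continual learners are commonly evaluated: after each step of a data stream, one measures accuracy on the newly learned concept (adaptation) and on everything seen before (retention). The following is a minimal, framework-agnostic Python sketch of such an evaluation loop; it is an illustration, not the dissertation's protocol, and Experience, train_on, and accuracy are hypothetical names.

    from dataclasses import dataclass
    from typing import Any, Callable, List


    @dataclass
    class Experience:
        # One step of the stream: data to learn from now, plus held-out test data.
        train_data: Any
        test_data: Any


    def run_stream(model: Any,
                   stream: List[Experience],
                   train_on: Callable[[Any, Any], None],
                   accuracy: Callable[[Any, Any], float]) -> List[dict]:
        # Train sequentially; after each step, measure adaptation to the current
        # concept and retention of everything seen so far.
        log = []
        for t, exp in enumerate(stream):
            train_on(model, exp.train_data)  # incorporate new knowledge
            past = [accuracy(model, e.test_data) for e in stream[:t]]
            log.append({
                "step": t,
                "adaptation": accuracy(model, exp.test_data),
                "retention": sum(past) / len(past) if past else None,
            })
        return log

A learner that maximizes adaptation alone will typically show decaying retention, and vice versa, which is exactly the remembering-versus-forgetting balance discussed above.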
Class-Incremental Mixture of Gaussians for Deep Continual Learning
Continual learning models for stationary data focus on learning and retaining concepts that arrive sequentially. In the most generic class-incremental environment, we have to be ready to deal with classes coming one by one, without any higher-level grouping. This requirement invalidates many previously proposed methods and forces researchers to look for more flexible alternatives. In this work, we follow the idea of centroid-driven methods and propose the end-to-end incorporation of a mixture of Gaussians model into the continual learning framework. By employing a gradient-based approach and designing losses capable of learning discriminative features while avoiding degenerate solutions, we successfully combine the mixture model with a deep feature extractor, allowing for joint optimization and adjustments in the latent space. Additionally, we show that our model can effectively learn in memory-free scenarios with fixed extractors. In the conducted experiments, we empirically demonstrate the effectiveness of the proposed solutions and show that our model is competitive with state-of-the-art continual learning baselines on image classification problems.
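To make the construction concrete, below is a minimal PyTorch sketch of the general idea: a mixture-of-Gaussians classification head that can be optimized jointly with a deep feature extractor by gradient descent. This is a sketch under assumptions, not the paper's exact method; GaussianMixtureHead, the diagonal covariances, the min_var floor guarding against degenerate components, and the cross-entropy-over-log-likelihoods loss are all illustrative choices.

    import math

    import torch
    import torch.nn as nn
    import torch.nn.functional as F


    class GaussianMixtureHead(nn.Module):
        # One diagonal-covariance Gaussian component per class, defined directly
        # in the latent space produced by a feature extractor.

        def __init__(self, num_classes: int, latent_dim: int, min_var: float = 1e-3):
            super().__init__()
            self.means = nn.Parameter(torch.randn(num_classes, latent_dim))
            self.log_vars = nn.Parameter(torch.zeros(num_classes, latent_dim))
            # A variance floor is one simple way to keep components from
            # collapsing into degenerate (near-zero-variance) solutions.
            self.min_var = min_var

        def log_likelihood(self, z: torch.Tensor) -> torch.Tensor:
            # Per-class log N(z | mean_c, diag(var_c)), shape (batch, num_classes).
            var = self.log_vars.exp() + self.min_var
            diff = z.unsqueeze(1) - self.means.unsqueeze(0)  # (batch, classes, dim)
            return -0.5 * ((diff ** 2 / var).sum(-1)
                           + var.log().sum(-1)
                           + self.means.size(1) * math.log(2 * math.pi))


    def discriminative_loss(head: GaussianMixtureHead,
                            z: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        # Softmax cross-entropy over class log-likelihoods: raises the likelihood
        # of the true class component relative to all others, encouraging
        # discriminative rather than purely generative latent features.
        return F.cross_entropy(head.log_likelihood(z), y)

Because the class log-likelihoods are differentiable in both the component parameters and the latent codes, a single backward pass, e.g. loss = discriminative_loss(head, extractor(x), y) followed by loss.backward(), adjusts the mixture and the extractor jointly; freezing the extractor and adding components for new classes gives the memory-free, fixed-extractor setting mentioned above.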